The rapid growth of digital libraries and online reading platforms has significantly increased the difficulty of identifying books that align with individual reader preferences. As modern platforms host millions of titles, users often experience decision fatigue when attempting to discover relevant content. Conventional recommendation approaches such as popularity-based rankings and static genre filtering fail to capture nuanced user interests and frequently produce repetitive or overly generic suggestions. This paper presents LibRaX, a scalable personalized book recommendation system based on collaborative filtering using latent factor models. The system leverages large-scale user–book interaction data and applies Alternating Least Squares (ALS) to learn compact latent representations of books from sparse rating matrices. These learned item embeddings capture collective user preference patterns beyond surface-level metadata. To enable efficient similarity search over hundreds of thousands of item embeddings, LibRaX integrates Facebook AI Similarity Search (FAISS) for approximate nearest neighbor retrieval, avoiding the computational and memory limitations of traditional brute-force K-Nearest Neighbor approaches. The FAISS index is constructed offline and loaded into memory during inference, enabling real-time recommendation generation. LibRaX is implemented using a Python-based backend that performs sparse matrix construction, offline model training, FAISS indexing, and real-time recommendation inference, alongside a Kotlin-based Android application that serves as the user interface. The system follows a stateless backend architecture with periodic batch retraining and similarity-based retrieval. This paper discusses the motivation behind the system, reviews relevant literature, and outlines the collaborative filtering methodology and architectural design decisions that enable LibRaX to operate efficiently at scale.
Introduction
Digital reading platforms offer vast catalogs of books, but users face information overload and difficulty finding relevant titles. Traditional discovery methods—bestseller lists, editorial curation, or genre categories—lack personalization, often emphasizing popularity over individual preferences.
Collaborative filtering addresses this by analyzing user behavior to predict preferences. Latent factor models extend traditional item–item similarity by mapping users and items into a lower-dimensional embedding space, improving robustness to sparse interaction data. However, large-scale similarity computation is challenging due to memory and computational costs.
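For reference, a latent factor model of this kind is commonly written as a regularized least-squares problem over the observed ratings, following the standard formulation in [2]. The notation below is introduced here for illustration and is not taken from the LibRaX implementation: $r_{ui}$ is the rating user $u$ gave item $i$, $\mathcal{K}$ is the set of observed (user, item) pairs, $\mathcal{K}_u$ is the set of items rated by user $u$, $p_u, q_i \in \mathbb{R}^f$ are the user and item factors, and $\lambda$ is a regularization weight. This is the usual explicit-rating form; the exact objective optimized by the system may differ.

```latex
% Regularized matrix factorization objective over observed ratings
\min_{P,\,Q} \; \sum_{(u,i) \in \mathcal{K}} \left( r_{ui} - p_u^{\top} q_i \right)^2
  + \lambda \left( \sum_{u} \lVert p_u \rVert^2 + \sum_{i} \lVert q_i \rVert^2 \right)

% ALS step: with all item factors q_i held fixed, each p_u has a closed-form solution
p_u \leftarrow \left( \sum_{i \in \mathcal{K}_u} q_i q_i^{\top} + \lambda I \right)^{-1}
  \sum_{i \in \mathcal{K}_u} r_{ui} \, q_i

% Item-item similarity is then measured between item factors, e.g. by cosine
\operatorname{sim}(i, j) = \frac{q_i^{\top} q_j}{\lVert q_i \rVert \, \lVert q_j \rVert}
```

The update for $q_i$ is symmetric, with the roles of users and items exchanged. Because each step solves a small $f \times f$ linear system per user or item, the cost scales with the number of observed ratings rather than with the full size of the rating matrix.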
LibRaX addresses this with FAISS-based approximate nearest neighbor (ANN) search, which retrieves similar items directly from the embedding vectors instead of computing a dense item–item similarity matrix. The system comprises a Python backend for data preprocessing, sparse matrix construction, and recommendation inference, and a Kotlin Android app for user interaction. The backend is stateless, keeping no per-user session state between requests, which helps preserve privacy, while the mobile app presents personalized recommendations in real time.
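A minimal sketch of this kind of retrieval is shown below. It assumes cosine similarity over L2-normalized item embeddings and an inverted-file index (`faiss.IndexIVFFlat`); the paper does not state which FAISS index type or parameters LibRaX uses, so the index choice, the `nlist` and `nprobe` values, and the helper names (`build_index`, `similar_items`) are illustrative assumptions. The embeddings here are random stand-ins for ALS item factors. If `faiss` is not installed, the sketch falls back to exact NumPy search so that it still runs.

```python
import numpy as np

try:
    import faiss  # optional dependency, e.g. the faiss-cpu package
except ImportError:  # fall back to exact NumPy search so the sketch still runs
    faiss = None


def l2_normalize(vectors):
    """Return contiguous float32 rows of unit length (inner product = cosine)."""
    vectors = np.asarray(vectors, dtype=np.float32)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return np.ascontiguousarray(vectors / np.maximum(norms, 1e-12), dtype=np.float32)


def build_index(item_vectors, nlist=8, nprobe=4):
    """Build an IVF index over normalized item embeddings (offline step)."""
    unit = l2_normalize(item_vectors)
    if faiss is None:
        return unit  # fallback "index" is just the normalized matrix
    dim = unit.shape[1]
    quantizer = faiss.IndexFlatIP(dim)  # coarse quantizer over cell centroids
    index = faiss.IndexIVFFlat(quantizer, dim, nlist, faiss.METRIC_INNER_PRODUCT)
    index.train(unit)      # learn nlist centroids with k-means
    index.add(unit)        # assign every item to its nearest cell
    index.nprobe = nprobe  # cells scanned per query: recall vs. speed trade-off
    return index


def search(index, queries, k):
    """Return (scores, ids) of the k most similar items for each query row."""
    unit = l2_normalize(queries)
    if faiss is None:
        scores = unit @ index.T
        ids = np.argsort(-scores, axis=1)[:, :k]
        return np.take_along_axis(scores, ids, axis=1), ids
    return index.search(unit, k)


def similar_items(index, item_vectors, item_id, k=5):
    """Top-k neighbors of one item, excluding the item itself."""
    _, ids = search(index, item_vectors[item_id:item_id + 1], k + 1)
    # FAISS pads with -1 when the probed cells hold fewer than k + 1 items.
    return [int(j) for j in ids[0] if j != item_id and j != -1][:k]


# Toy data: random stand-ins for item embeddings, with item 1 made a
# near-duplicate of item 0 so the expected nearest neighbor is known.
rng = np.random.default_rng(0)
item_vectors = rng.normal(size=(1000, 16))
item_vectors[1] = item_vectors[0] + 1e-2 * rng.normal(size=16)

index = build_index(item_vectors)
neighbors = similar_items(index, item_vectors, item_id=0, k=5)
```

With an IVF index, only `nprobe` of the `nlist` cells are scanned per query, which is what makes the search approximate. For the offline-build, online-load pattern, FAISS provides `faiss.write_index` and `faiss.read_index` to persist an index to disk and reload it at startup.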
LibRaX’s pipeline includes: preprocessing user–item ratings, constructing a sparse interaction matrix, factorizing it via Alternating Least Squares (ALS) to generate latent embeddings, indexing embeddings with FAISS, and returning top-ranked book recommendations. This design ensures scalable, memory-efficient, and personalized book discovery across large digital libraries.
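The preprocessing and factorization stages of this pipeline can be sketched as follows. This is a NumPy-only illustration under stated assumptions, not the LibRaX code: it uses the explicit-rating form of ALS, stores only the observed (user, item, rating) triples in coordinate form rather than a SciPy sparse matrix, and the function names (`encode_ids`, `als_explicit`, `rmse`), hyperparameters, and toy ratings are all hypothetical. A production system would more likely rely on an optimized ALS library.

```python
import numpy as np


def encode_ids(raw_ids):
    """Map arbitrary raw ids to contiguous indices 0..n-1."""
    unique_ids, codes = np.unique(np.asarray(raw_ids), return_inverse=True)
    return unique_ids, codes.ravel()


def als_explicit(users, items, ratings, n_users, n_items,
                 rank=8, reg=0.1, n_iters=20, seed=0):
    """Factorize observed ratings into user factors P and item factors Q."""
    users, items = np.asarray(users), np.asarray(items)
    ratings = np.asarray(ratings, dtype=np.float64)
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, rank))
    Q = rng.normal(scale=0.1, size=(n_items, rank))
    # Positions of the observed entries for each user and each item. Only
    # observed ratings are stored; the dense matrix is never materialized.
    # (This grouping is quadratic in the worst case; fine for a sketch.)
    by_user = [np.flatnonzero(users == u) for u in range(n_users)]
    by_item = [np.flatnonzero(items == i) for i in range(n_items)]
    ridge = reg * np.eye(rank)
    for _ in range(n_iters):
        # Fix Q and solve a small ridge regression for each user.
        for u, idx in enumerate(by_user):
            if idx.size:
                Qu = Q[items[idx]]
                P[u] = np.linalg.solve(Qu.T @ Qu + ridge, Qu.T @ ratings[idx])
        # Fix P and solve a small ridge regression for each item.
        for i, idx in enumerate(by_item):
            if idx.size:
                Pi = P[users[idx]]
                Q[i] = np.linalg.solve(Pi.T @ Pi + ridge, Pi.T @ ratings[idx])
    return P, Q


def rmse(P, Q, users, items, ratings):
    """Root-mean-square error on the given observed entries."""
    pred = np.sum(P[np.asarray(users)] * Q[np.asarray(items)], axis=1)
    return float(np.sqrt(np.mean((np.asarray(ratings) - pred) ** 2)))


# Toy ratings (invented for illustration): (user, book, rating) triples.
raw = [("ann", "dune", 5.0), ("ann", "emma", 1.0), ("ann", "neuromancer", 4.0),
       ("bob", "dune", 4.0), ("bob", "neuromancer", 5.0),
       ("cy", "emma", 5.0), ("cy", "persuasion", 4.0),
       ("dee", "emma", 4.0), ("dee", "persuasion", 5.0)]
user_ids, user_idx = encode_ids([r[0] for r in raw])
book_ids, book_idx = encode_ids([r[1] for r in raw])
values = np.array([r[2] for r in raw])

user_factors, item_factors = als_explicit(
    user_idx, book_idx, values, len(user_ids), len(book_ids), rank=2)
```

The rows of `item_factors` are the item embeddings that would then be indexed for similarity search; `book_ids` maps each row back to its original identifier.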
Conclusion
This paper presented LibRaX, a scalable personalized book recommendation system based exclusively on item–item collaborative filtering and FAISS-based approximate nearest neighbor search. The system was designed to address key challenges associated with large-scale recommendation, including memory constraints, inference latency, and deployment feasibility.
By leveraging FAISS for efficient similarity retrieval and adopting a stateless backend architecture, LibRaX achieves real-time recommendation generation without requiring dense similarity computation or frequent model rebuilding. The case study and qualitative analysis demonstrate that the system adapts effectively to user interaction data and provides coherent recommendations throughout different stages of user engagement.
Future work may explore the integration of lightweight content-aware signals, incremental index updates, and large-scale online evaluation to further enhance recommendation coverage and robustness. Nonetheless, LibRaX demonstrates that FAISS-based collaborative filtering offers a practical and scalable foundation for modern digital reading platforms.
References
[1] B. Sarwar et al., “Item-based collaborative filtering recommendation algorithms,” in Proc. WWW, 2001.
[2] Y. Koren, R. Bell, and C. Volinsky, “Matrix factorization techniques for recommender systems,” IEEE Computer, 2009.
[3] R. Burke, “Hybrid recommender systems: Survey and experiments,” User Modeling and User-Adapted Interaction, 2002.
[4] J. Johnson, M. Douze, and H. Jégou, “Billion-scale similarity search with GPUs,” IEEE Transactions on Big Data, 2019.
[5] A. Andoni et al., “Practical and optimal LSH for angular distance,” in Proc. NIPS, 2015.
[6] M. Gomez-Uribe and N. Hunt, “The Netflix recommender system,” ACM TMIS, 2016.
[7] G. Shani and A. Gunawardana, “Evaluating recommendation systems,” Recommender Systems Handbook, 2011.